Volumetric Correspondence Networks for Optical Flow

Neural Information Processing Systems

Many classic tasks in vision -- such as the estimation of optical flow or stereo disparities -- can be cast as dense correspondence matching. Well-known techniques for doing so make use of a cost volume, typically a 4D tensor of match costs between all pixels in a 2D image and their potential matches in a 2D search window.
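As a hedged illustration (not the paper's implementation; the function name, dot-product match cost, and zero-padding scheme are assumptions), a minimal NumPy sketch of such a 4D cost volume looks like this:

```python
import numpy as np

def build_cost_volume(feat1, feat2, radius):
    """4D cost volume: match scores between every pixel of feat1 and a
    (2*radius+1)^2 search window in feat2, using dot-product correlation.

    feat1, feat2: (H, W, C) feature maps.
    Returns: (H, W, 2*radius+1, 2*radius+1) array of match costs.
    """
    H, W, C = feat1.shape
    win = 2 * radius + 1
    # Zero-pad feat2 so out-of-bounds candidates contribute zero cost.
    padded = np.pad(feat2, ((radius, radius), (radius, radius), (0, 0)))
    cost = np.empty((H, W, win, win), dtype=feat1.dtype)
    for dy in range(win):
        for dx in range(win):
            shifted = padded[dy:dy + H, dx:dx + W]          # candidate features
            cost[:, :, dy, dx] = (feat1 * shifted).sum(-1)  # per-pixel dot product
    return cost

# Example: a 32x32 image with 16-dim features and a +/-3 pixel search window.
rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=(2, 32, 32, 16))
vol = build_cost_volume(f1, f2, radius=3)
print(vol.shape)  # (32, 32, 7, 7)
```

Note that the volume holds H*W*(2R+1)^2 entries, so memory grows quadratically with the search radius; this scaling pressure is what makes efficient volumetric matching a concern in the first place.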


Scalars are universal: Equivariant machine learning, structured like classical physics

Neural Information Processing Systems

There has been enormous progress in the last few years in designing neural networks that respect the fundamental symmetries and coordinate freedoms of physical law. Some of these frameworks make use of irreducible representations, some make use of high-order tensor objects, and some apply symmetry-enforcing constraints. Different physical laws obey different combinations of fundamental symmetries, but a large fraction (possibly all) of classical physics is equivariant to translation, rotation, reflection (parity), boost (relativity), and permutations. Here we show that it is simple to parameterize universally approximating polynomial functions that are equivariant under these symmetries, or under the Euclidean, Lorentz, and Poincaré groups, at any dimensionality $d$. The key observation is that nonlinear O($d$)-equivariant (and related-group-equivariant) functions can be universally expressed in terms of a lightweight collection of scalars---scalar products and scalar contractions of the scalar, vector, and tensor inputs. We complement our theory with numerical examples that show that the scalar-based method is simple, efficient, and scalable.
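To make the key observation concrete, here is a toy NumPy sketch (all names are illustrative, not the paper's code): any function of the pairwise scalar products of the input vectors, used as coefficients on those same vectors, yields an O(d)-equivariant map, and the final assertion verifies this for a random orthogonal transform.

```python
import numpy as np

def equivariant_vector_fn(vectors, coeff_fn):
    """O(d)-equivariant map built only from invariant scalars.

    vectors: (n, d) array of input vectors.
    coeff_fn: any function of the n*n invariant scalar products -> n coefficients.
    Returns a d-vector; rotating/reflecting the inputs rotates the output identically.
    """
    gram = vectors @ vectors.T       # pairwise scalar products: the invariants
    coeffs = coeff_fn(gram.ravel())  # arbitrary nonlinear function of the invariants
    return coeffs @ vectors          # equivariant: linear combination of input vectors

# Equivariance check under a random orthogonal transform (hypothetical toy setup).
rng = np.random.default_rng(0)
v = rng.normal(size=(3, 4))
q, _ = np.linalg.qr(rng.normal(size=(4, 4)))        # random orthogonal matrix
f = lambda s: np.tanh(s.reshape(3, 3)).sum(axis=1)  # toy coefficient function
out1 = equivariant_vector_fn(v @ q, f)
out2 = equivariant_vector_fn(v, f) @ q
assert np.allclose(out1, out2)
```

The design point is that equivariance holds by construction: the Gram matrix is unchanged by any orthogonal transform, so only the vector "basis" rotates with the input.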


Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning

Neural Information Processing Systems

We challenge a common assumption underlying most supervised deep learning: that a model makes a prediction depending only on its parameters and the features of a single input. To this end, we introduce a general-purpose deep learning architecture that takes as input the entire dataset instead of processing one datapoint at a time. Our approach uses self-attention to reason about relationships between datapoints explicitly, which can be seen as realizing non-parametric models using parametric attention mechanisms. However, unlike conventional non-parametric models, we let the model learn end-to-end from the data how to make use of other datapoints for prediction. Empirically, our models solve cross-datapoint lookup and complex reasoning tasks unsolvable by traditional deep learning models. We show highly competitive results on tabular data, early results on CIFAR-10, and give insight into how the model makes use of the interactions between points.
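As a minimal, hypothetical sketch of the core mechanism (single head, NumPy, no claim to match the paper's architecture): the attention axis is the dataset itself, so each datapoint's representation is updated by attending to every other datapoint.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_between_datapoints(X, Wq, Wk, Wv):
    """Single-head self-attention where the 'sequence' axis is the dataset:
    each datapoint attends to every other datapoint.

    X: (n, d) -- the whole dataset of n datapoints, not a single input.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # (n, n) datapoint-datapoint affinities
    return softmax(scores, axis=-1) @ V      # each row mixes information across points

rng = np.random.default_rng(0)
n, d, h = 8, 5, 5
X = rng.normal(size=(n, d))
out = attention_between_datapoints(X, *(rng.normal(size=(d, h)) for _ in range(3)))
print(out.shape)  # (8, 5): one updated representation per datapoint
```

Contrast this with standard usage, where attention runs over tokens within a single input and each prediction sees only that input's own features.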



43e4e6a6f341e00671e123714de019a8-AuthorFeedback.pdf

Neural Information Processing Systems

We appreciate the reviewer's valuable comments and were glad to read the positive remarks; we also appreciate the thorough feedback for further improvements. What is trained in the PRE-approach? Is there a benefit in using the differentiable PDE solver? Do steps of a differentiable simulator correspond to time steps? Yes, in our text "step" typically means time step.




revised version of the paper

Neural Information Processing Systems

We would like to thank the reviewers for their comments and suggestions. As shown in Figure 7 in Appendix F.3, this is likely to increase their individual utility in the long term; we will clarify this in the revised version of the paper. We will fix the statement of Proposition 4. The "strategic setting" refers to a scenario in which individuals are subject to (semi-)automated decision-making. A counterfactual is a statement of how the world would have to be different for a desirable outcome to occur [13]. We will clarify this in the revised version of the paper.